Mini Challenge 3: Cell Phone Calls

Authors and Affiliations:

Jason Payne, Palantir Technologies, jpayne@palantirtech.com [PRIMARY contact]

Ravi Sankar, Palantir Technologies

Matt Grimm, Palantir Technologies

Jake Solomon, Palantir Technologies


Student Team: NO

Tool(s):

For the VAST competition, the analyses were performed primarily in the Palantir Government platform and, to a lesser extent, in Google Earth and the Palantir Finance platform. Both Palantir platforms are being developed by Palantir Technologies, based in Palo Alto, California. Palantir Technologies was founded in 2004 and works with customers across the Intelligence and Finance Communities.

The development team at Palantir decided early in the company's history to build an analytic platform on a foundation of openness, a trait not often seen in the intelligence community. As old institutions transition into a world where information is increasingly a commodity, the archaic paradigms of locking down knowledge are giving way to an environment where analysis is the real power. Palantir Technologies liberates this power in four concrete ways. The first is Data Integration: whether the data is structured or unstructured, Palantir provides standard and extensible interfaces for bringing information into a common environment. The second is Search and Discovery, whereby these disparate data stores can be explored as though they were one. The third is Knowledge Management, in which all the knowledge that is discovered is treated like another data source so no analysis is lost. The fourth is Collaboration, whereby many analysts working together can truly leverage their collective mind. Through our open APIs and numerous (and multiplying) extensibility points, Palantir has succeeded in creating a genuine platform for application development and information analysis.

Two Page Summary: YES (will be submitted before 18 Aug)



Answers:

Phone-1: What is the Catalano/Vidro social network, as reflected in the cell phone call data, at the end of the time period?

   PhoneNodes.txt

   PhoneLinks.txt



Phone-2: Characterize the changes in the Catalano/Vidro social structure over the ten day period.

Video link:

   Phone Video


Detailed Answer:

The first thing any cinema fugitive does is chuck the cell phone that investigators are bound to track. While these scenarios are fictional, the importance of call records today is not: Palantir’s customers regularly deal with massive amounts of SIGINT (signals intelligence). Through the Intermediary Framework, Palantir allows phone calls to be viewed either as independent events or as links between entities. It is not an either/or choice; both states exist simultaneously, and the analyst can view whichever is more appropriate to the analytic task at hand (figures 1.1 and 1.2).

intermediary framework-link

intermediary framework-event

Figures 1.1 and 1.2: Phone calls as a link (left) and as distinct events (right)
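
The dual view can be illustrated outside Palantir with a short Python sketch. It is not the Intermediary Framework itself: the `CallEvent` class, the `as_links` helper, and the sample records are all hypothetical, and they only show how the same call records can be read as individual events or collapsed into weighted links.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallEvent:
    """One phone call kept as an independent event (hypothetical schema)."""
    caller: int
    callee: int
    start: datetime
    tower: int


def as_links(events):
    """Collapse call events into undirected links weighted by call count."""
    links = Counter()
    for event in events:
        links[frozenset((event.caller, event.callee))] += 1
    return links


# Invented sample records, not taken from the challenge data.
events = [
    CallEvent(200, 5, datetime(2000, 1, 1, 9, 0), tower=12),
    CallEvent(5, 200, datetime(2000, 1, 1, 18, 30), tower=7),
    CallEvent(200, 1, datetime(2000, 1, 2, 10, 15), tower=12),
]

# Event view: the list itself. Link view: one weighted edge per pair.
links = as_links(events)
for pair, count in links.items():
    print(sorted(pair), count)
```

Nothing is lost in the link view as long as the event list is kept alongside it, which is the property the paragraph above describes.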

 

Our team imported the provided records as phone calls, each referencing both phones involved, the time of the call, and the cell tower that originated the call. We also added geo-coordinates to the cell towers, allowing us to view the origin of the calls geospatially (figure 2).

Google Earth Overview

Figure 2: All the calls in 200’s 2nd-order network
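
As a rough illustration of the import step, the sketch below attaches tower coordinates to call records in plain Python. The column names, the `geocode_calls` helper, and every coordinate are assumptions made for the example; they are not the layout of the challenge files or the Palantir import interface.

```python
import csv
import io

# Invented sample input; the column names are assumptions.
CALLS_CSV = """caller,callee,start,tower
200,5,2000-01-01 09:00,12
1,200,2000-01-01 09:30,7
3,2,2000-01-01 11:00,99
"""

# Hypothetical tower coordinates as (latitude, longitude).
TOWER_COORDS = {12: (10.00, -60.00), 7: (10.05, -60.02)}


def geocode_calls(csv_text, tower_coords):
    """Attach the originating tower's coordinates to each call record."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        tower = int(row["tower"])
        rows.append({
            "caller": int(row["caller"]),
            "callee": int(row["callee"]),
            "start": row["start"],
            "tower": tower,
            # None when no coordinates are known for this tower.
            "lat_lon": tower_coords.get(tower),
        })
    return rows


geocoded = geocode_calls(CALLS_CSV, TOWER_COORDS)
for record in geocoded:
    print(record["caller"], record["tower"], record["lat_lon"])
```

Only the originating tower is recorded for each call, so the coordinates describe where the caller was, not the recipient.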

 

Given medium confidence that ID 200 is Paraiso leader Ferdinando Catalano, we treated that identification as a starting premise to be verified later. We performed egocentric social network analysis on ID 200 by pulling all first-order connections (anyone directly linked to Catalano by a phone call) into the Graph View (figure 3).

1st order

Figure 3: 200’s immediate network (line thickness denotes number of links)

 

An organization head often communicates directly only with his elites, and the immediate network revealed is indeed very tight. The node that talked most often to 200 (14 calls) is ID 5, and our intelligence suggested that his brother, Esteban Catalano, was most likely to hold this position. We also knew that David Vidro is Catalano’s deputy and therefore expected him to have the second greatest number of communications. On that basis, we concluded that David most likely holds ID 1. Palantir’s histogram, which provides high-level overviews of data selections and their commonalities, made drawing these conclusions relatively straightforward (figure 4). IDs 2 and 3 are directly linked to David, so we suspect that they belong to Juan and Jorge Vidro. Unfortunately, we do not have sufficient information to distinguish between these two brothers.

1st order histogram 

Figure 4: 1st-order network histogram      
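
The ranking that the histogram provides can be approximated with a simple count of calls between the ego and each direct contact. This is a minimal sketch, not the Palantir histogram: the `first_order_counts` helper is hypothetical and the call list is invented, so the counts below do not reproduce the figures reported above.

```python
from collections import Counter


def first_order_counts(calls, ego):
    """Count calls between the ego and each of its direct contacts."""
    counts = Counter()
    for caller, callee in calls:
        if caller == ego:
            counts[callee] += 1
        elif callee == ego:
            counts[caller] += 1
    return counts


# Invented (caller, callee) pairs, not taken from the challenge data.
calls = [
    (200, 5), (5, 200), (200, 5),
    (1, 200), (200, 1),
    (200, 137),
    (1, 2), (1, 3),  # calls that do not involve the ego are ignored
]

counts = first_order_counts(calls, ego=200)
for contact, n in counts.most_common():
    print(contact, n)
```

The same counts could drive the line thickness shown in figure 3.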

 

The next step in the investigation was an expansion to second-order connections (anyone connected to Catalano or to one of his direct connections; figure 5).

2nd order with timeline

Figure 5: 2nd-order network
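
One way to reproduce the second-order expansion outside the Graph View is a breadth-first search that stops two hops from the ego. The sketch below assumes calls are treated as undirected links; the helper names and the call list are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(calls):
    """Build an undirected adjacency map from (caller, callee) pairs."""
    adjacency = defaultdict(set)
    for caller, callee in calls:
        adjacency[caller].add(callee)
        adjacency[callee].add(caller)
    return adjacency


def within_hops(adjacency, ego, max_hops=2):
    """Return {node: hop distance} for every node within max_hops of ego."""
    distance = {ego: 0}
    queue = deque([ego])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the requested order
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


# Invented (caller, callee) pairs, not taken from the challenge data.
calls = [(200, 5), (200, 1), (1, 2), (1, 3), (2, 137), (137, 40)]

network = within_hops(build_adjacency(calls), ego=200, max_hops=2)
print(sorted(network.items()))
```

In this toy data, ID 137 sits three hops from the ego and is therefore left out of the second-order network.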

 

Pulling up our Timeline View immediately revealed two details about the phone call record: a cyclical rise/fall of calls and a sudden drop in traffic during the last three days. The daily pattern is intuitive, as calls are less frequent during the night. The network silence starting on Thursday, however, does not have a similarly apparent explanation. The previous Thursday, Friday, and Saturday had high levels of traffic, but at the end of the time period, IDs 1, 2, 3, and 5 were barely active in the inner network, and most of the few calls that were being made passed through ID 137. Potential explanations include the movement keeping a low profile, a sudden shift in the power structure, or—most likely—a decision to change cell phones.
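
The Timeline observation amounts to binning calls by time for a chosen group of IDs. A minimal sketch of that idea follows, assuming each call carries a start time; the `calls_per_day` helper and the sample calls are hypothetical, and binning by hour instead of by day would expose the day/night cycle in the same way.

```python
from collections import Counter
from datetime import date, datetime


def calls_per_day(calls, members):
    """Count, per calendar day, the calls that involve any group member."""
    per_day = Counter()
    for caller, callee, start in calls:
        if caller in members or callee in members:
            per_day[start.date()] += 1
    return per_day


# Invented (caller, callee, start) records, not from the challenge data.
calls = [
    (1, 5, datetime(2000, 1, 1, 9, 0)),
    (2, 1, datetime(2000, 1, 1, 14, 0)),
    (3, 40, datetime(2000, 1, 2, 10, 0)),
    (40, 41, datetime(2000, 1, 2, 11, 0)),
    (40, 41, datetime(2000, 1, 3, 12, 0)),
]

inner = {1, 2, 3, 5}
per_day = calls_per_day(calls, inner)
for day in (date(2000, 1, 1), date(2000, 1, 2), date(2000, 1, 3)):
    print(day, per_day[day])
```

A day on which the inner group's count falls to zero while other callers remain active is the kind of silence described above.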

all 400, auto layout

Figure 6.1: All 400 callers, with an Auto-Layout (the jumbled layout indicates that the callers are densely interconnected)

 

overall timeline for all 400

Figure 6.2: The timeline of the total network

 

At this point, we needed to compare our proposed inner circle to the overall network. We opened all 400 people and searched for all links between them. The results were important on two counts (figures 6.1 and 6.2). First, ID 1 had by far the largest number of calls, with ID 5 in second place and ID 200 quite low on the list. Those positions confirmed our initial ID assignments: on the macro scale, deputy David (1) should replace brother Esteban (5) in the top position, and leader Ferdinando (200) should not be making very many calls overall. Second, the overall network view showed no drop in traffic during the last several days, bolstering the idea that the inner network had simply changed phones.


To check this hypothesis, we used the History tool to revert to all 400 entities without connections between them. We then performed a “Search Around” (figure 7.1), looking only at calls placed after Thursday. The Histogram revealed an entirely different set of key players (including 309, 306, 360, 0, and 397). ID 0 had been active throughout, but the others were new, so we removed everyone but those new IDs from the graph. When we compared the immediate network of 309, 306, 360, and 397 to that of 1, 2, 3, and 5, we saw an almost identical picture (figures 7.2 and 7.3) with an inverse Timeline (figures 7.4 and 7.5).

Search Around

Figure 7.1: Search Around: find entities linked by phone calls within the specified time window
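
A time-windowed search of this kind can be sketched as a filter followed by a count. The code below is not the Palantir Search Around; the `search_around` function, its parameters, and the sample calls are hypothetical stand-ins that show the idea of restricting link discovery to a time window.

```python
from collections import Counter
from datetime import datetime


def search_around(calls, seeds, start, end=None):
    """Count calls per ID for calls that touch a seed inside the window."""
    activity = Counter()
    for caller, callee, when in calls:
        if when < start or (end is not None and when >= end):
            continue  # outside the requested time window
        if caller in seeds or callee in seeds:
            activity[caller] += 1
            activity[callee] += 1
    return activity


# Invented (caller, callee, start) records, not from the challenge data.
calls = [
    (1, 5, datetime(2000, 1, 1, 9, 0)),
    (309, 306, datetime(2000, 1, 8, 10, 0)),
    (309, 40, datetime(2000, 1, 8, 12, 0)),
    (306, 41, datetime(2000, 1, 9, 8, 0)),
]

everyone = {caller for caller, _, _ in calls} | {callee for _, callee, _ in calls}
late = search_around(calls, everyone, start=datetime(2000, 1, 8))
for node, n in late.most_common():
    print(node, n)
```

IDs that rank highly in the late window but are absent from earlier windows are the candidates for replacement phones.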

 

autolayout, connections from 1,2,3,5

autolayout, connections from 309,397,306,360

Figures 7.2 and 7.3: Compare 200’s network (left) with 300’s network (right)

 

beginning timeline

300's timeline

Figures 7.4 and 7.5: Timelines for the initial network (7.4) and the new network (7.5)

 

ID 300, placed in the middle and connected to all four 300-series entities, was clearly Ferdinando Catalano, so we started a fresh investigation and applied our workflow for ID 200 to ID 300. This smaller subset of the records was slightly more ambiguous. ID 268 conversed by far the most with 300, suggesting he might be Esteban, but unlike ID 5 he is hardly connected to anyone else in the network. 309’s network very closely mirrors 1’s and uniquely connects to all the nodes in 300’s network, yet 309 talks the least of anyone to 300. Ultimately, we decided that the first-order network over a 3-day period was less reliable than the second-order network: David Vidro didn’t forget what to do because he got a new cell phone. Before making the final decisions, however, we decided to check our geospatial records.


We viewed the links between callers as phone call events and moved all dialed calls to Google Earth, since we only have the originators’ cell towers. We then compared the movements of the initial and final versions of the network; unfortunately, the results were of little help for two reasons. First, in the initial network, only ID 3 frequently leaves the center of the island, so the members' initial patterns are fairly similar to one another. Second, in the final network, everyone begins to move more frequently. This is potentially significant for the Movement, but leaves some guesswork in the final assignments. We then decided to compare the members of the old and new networks and realized that everyone outside the inner network had kept the same IDs. That proved 309 was formerly 1 (David), 306 was 5 (Esteban), 397 was 2 (Juan or Jorge), 360 was 3 (the other of Juan and Jorge), and 300 was, of course, 200, or Ferdinando.
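
The final matching step relied on the observation that contacts outside the inner network kept their IDs. We made the comparison by inspection in the Graph View; the sketch below substitutes a simple Jaccard overlap of outer contacts as one possible way to automate it. The helper names and all call pairs are invented for illustration, and the pairing is greedy, so it does not guarantee a one-to-one assignment.

```python
def contact_set(calls, node, ignore):
    """Return the contacts of node, leaving out any IDs listed in ignore."""
    contacts = set()
    for a, b in calls:
        if a == node and b not in ignore:
            contacts.add(b)
        elif b == node and a not in ignore:
            contacts.add(a)
    return contacts


def jaccard(x, y):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def best_matches(early_calls, late_calls, old_ids, new_ids):
    """Pair each new ID with the old ID whose outer contacts overlap most."""
    ignore = set(old_ids) | set(new_ids)  # compare outer contacts only
    matches = {}
    for new in new_ids:
        new_contacts = contact_set(late_calls, new, ignore)
        score, old = max(
            (jaccard(new_contacts, contact_set(early_calls, old, ignore)), old)
            for old in old_ids
        )
        matches[new] = (old, score)
    return matches


# Invented (caller, callee) pairs; the outer contact IDs are made up.
early_calls = [(1, 40), (1, 41), (1, 42), (5, 50), (5, 51), (1, 5)]
late_calls = [(309, 40), (309, 41), (309, 42), (306, 50), (306, 51)]

matches = best_matches(early_calls, late_calls, old_ids=[1, 5], new_ids=[309, 306])
for new, (old, score) in matches.items():
    print(new, "was probably", old, "with overlap", score)
```

A high overlap only suggests a shared identity; in our analysis the assignments also drew on the network position and call volume discussed earlier.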